Process for making genes encoding random polymers of amino acids.



A method for making a synthetic gene encoding a random polymer of amino acids, but which has
predetermined amino acid constituents, comprises the polymerisation of small oligonucleotide duplexes.

The novel synthesis can also be used as part of a method for identifying amino acid polymers with biological
and/or immunological activity.
PROCESS FOR MAKING GENES ENCODING RANDOM POLYMERS OF AMINO ACIDS



Background of the Invention 



Copolymer 1 (COP-1) is a synthetic polypeptide analog of myelin basic protein (MBP), which is a
natural component of the myelin sheath. It has been suggested as a potential therapeutic agent for multiple
sclerosis (Eur. J. Immunol. [1971] 1:242; and J. Neurol. Sci. [1977] 31:433). Interest in COP-1 as an
immunotherapy for multiple sclerosis stems from observations first made in the 1950s that myelin
components such as MBP prevent or arrest experimental autoimmune encephalomyelitis (EAE). EAE is a
disease resembling multiple sclerosis that can be induced in susceptible animals.

COP-1 was developed by Drs. Sela, Arnon, and their co-workers at the Weizmann Institute (Rehovot,
Israel). It was shown to suppress experimental allergic encephalomyelitis (EAE) (Eur. J. Immunol. [1971]
1:242-248; U.S. Patent No. 3,849,550). More recently, COP-1 was shown to be beneficial for patients with
the exacerbating-remitting form of multiple sclerosis (N. Engl. J. Med. [1987] 317:408-414). Patients treated
with daily injections of COP-1 had fewer exacerbations and smaller increases in their disability status than
the control patients.

COP-1 is a mixture of polypeptides composed of alanine, glutamic acid, lysine, and tyrosine in a molar
ratio of approximately 6:2:5:1, respectively. It is synthesized by chemically polymerizing the four amino
acids, forming products with average molecular weights of 23,000 daltons. Although the resulting polypeptides
are composed of the same amino acid components, they differ with respect to their amino acid
sequences. In fact, there are 10^100 possible ways to assemble a 23,000 dalton polypeptide composed of
alanine, glutamic acid, lysine and tyrosine in the designated ratios. Purification of one or even a small
number of distinct COP-1 polypeptides from chemically-synthesized COP-1 is not possible.
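
The scale of this sequence space can be checked with a short calculation. The sketch below is only an estimate under stated assumptions: a chain of 210 residues (roughly 23,000 daltons at about 110 daltons per residue) with residue counts fixed exactly at the 6:2:5:1 ratio. Neither the chain length nor the per-residue mass is taken from the text; both are illustrative.

```python
import math

def log10_multinomial(counts):
    """log10 of n! / (c1! * c2! * ...), using log-gamma to avoid huge integers."""
    n = sum(counts)
    ln_value = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return ln_value / math.log(10)

# Assumed chain: 15 copies of the 6:2:5:1 ratio (Ala:Glu:Lys:Tyr), i.e. 210 residues.
RATIO = {"Ala": 6, "Glu": 2, "Lys": 5, "Tyr": 1}
COPIES = 15  # hypothetical value chosen so that 14 * 15 = 210 residues
counts = [r * COPIES for r in RATIO.values()]

exponent = log10_multinomial(counts)
print(f"roughly 10^{exponent:.0f} distinct sequences for {sum(counts)} residues")
```

Under these assumptions the count of distinct orderings has an exponent above 100, which is why isolating single species from the chemical mixture is impractical.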

Studies evaluating COP-1's efficacy have been hindered somewhat by inconsistent batches of COP-1.
Also, it is not known which of the amino acid polymer(s) is responsible for the biological activity of COP-1.
Other random sequence amino acid copolymers related to COP-1 have been chemically synthesized and
tested for the ability to suppress experimental allergic encephalomyelitis (Eur. J. Immunol. [1973] 3:273;
Immunochemistry [1976] 13:333). Biological activity was observed in EAE assays using COP-1 related
polymers in which one of the following changes occurs: tyrosine is replaced by tryptophan; or glutamic acid
is replaced by aspartic acid; or tyrosine is excluded.

We have developed procedures for synthesizing genes encoding polypeptides composed of specific 
amino acids but having random amino acid sequences. The amino acid composition of the polypeptides is 
dictated by the set of codons incorporated in the synthetic genes. Likewise, the size of the polypeptides is 
controlled by synthesizing genes of specific lengths. 
Brief Summary of the Invention

The subject invention concerns a method for synthesizing genes which encode random polymers of
amino acids. A further aspect of the invention is the identification of certain polypeptides which are
expressed by the synthetic genes and which have high levels of biological activity. The general methodology
of the subject invention is outlined in Figure 1.

A critical step in the novel process of the subject invention is the polymerization of small
oligonucleotide duplexes. Preferably, the oligonucleotide duplexes consist of a multiple of three nucleotides.
The length of the synthetic genes can be controlled through the use of adaptors specific for the 5' and 3'
ends of the oligonucleotides. Further, the composition of the resultant polypeptides can be varied, with
respect to the relative proportions of the amino acid constituents, by varying the proportions of input
oligonucleotide duplexes.

The synthesis of polypeptides similar to COP-1 exemplifies the procedures of the subject invention. The
initial step in the procedure is the synthesis of genes which code for polypeptides consisting of predetermined
amino acid constituents. These genes are then cloned in an expression vector and introduced into E.
coli such that each recombinant bacterial colony contains one COP-1 gene. To generate a mixture of COP-1
polypeptides (analogous to the chemically synthesized product) we produce COP-1 polypeptides from a
pool of recombinant bacterial colonies containing COP-1 gene sequences, e.g., 1000 colonies.
The efficacy of the pool of recombinant COP-1 polypeptides is tested in experimental allergic
encephalomyelitis (EAE) assays. If effective, the pool of colonies exhibiting activity is further subdivided
(e.g., into pools of 100 colonies), and the polypeptides from these smaller pools are tested. By sequentially
fractionating and selecting the most active pools, we identify individual recombinant COP-1 polypeptides or
small groups of polypeptides with biological activity in EAE assays equal to or higher than chemically
synthesized COP-1. The opportunity to characterize homogeneous, individual COP-1 polypeptides is unique
to this approach.

The subject invention also concerns the synthetic genes and the polypeptides produced by the
methods disclosed herein. Advantageously, the procedures of the subject invention can be used to produce
polypeptides which may be useful in preventing, arresting, or controlling demyelinating disorders such as
multiple sclerosis. A preferred copolymer according to the subject invention consists substantially of
alanine, lysine, and either glutamic or aspartic acid. Polymers of any length or molecular weight can be
synthesized using the procedures of the subject invention. Another preferred copolymer further includes
either tyrosine or tryptophan. More specifically, a preferred copolymer may consist of alanine, lysine,
glutamic acid, and tyrosine, and have a molecular weight between about 5,000 and 50,000 daltons. Further,
the method of the subject invention can also be used to make fusion proteins.



Brief Description of the Drawings 
Figure 1 depicts the general methodology for synthesizing random genes and identifying polypeptides
with biological activity.
Figure 2 shows one specific strategy for synthesizing genes encoding random-sequence polypeptides.
Figure 3 shows the possible 3-amino acid combinations and their percent occurrence in COP-1.
Figure 4 shows 9-nucleotide duplexes and adaptors for COP-1 gene synthesis.
Figure 5 shows the construction and cloning of synthetic COP-1 genes.
Figure 6 provides the sequence analysis of one synthetic gene revealing proper junctions between
duplexes.
Figure 7 is a Western blot showing the fusion protein produced by four different clones.
Figure 8 shows variations of random-gene synthesis using small duplexes.
Figure 9 shows synthesis of single-stranded DNA encoding random sequence amino acid polymers.
Figure 10 shows a phosphoramidite trinucleotide for random gene synthesis using a DNA synthesizer.
Figure 11 shows the DNA and amino acid sequences of rCOP-1-77.
Figure 12 shows the DNA and amino acid sequences of rCOP-1-19.
Figure 13 shows the average EAE scores of disease-induced guinea pigs which were untreated or
treated with myelin basic protein, rCOP-1-19, or rCOP-1-77.
Detailed Description of the Invention



One strategy for synthesizing genes encoding random-sequence polypeptides is outlined in Figure 2.
The example illustrates the synthesis of genes composed of two DNA duplexes. Oligonucleotides comprising
the duplexes are synthesized and annealed. Each DNA duplex has the same "sticky ends", represented
by X and X' in Figure 2. The duplexes are mixed together, sticky ends align, and the ends are joined in an
enzymatic reaction producing long segments of DNA (genes). Since the sticky ends on each duplex are the
same, the duplexes can align and ligate in any order. Thus, a series of genes composed of the same
duplexes but in different orders are produced. The polypeptides encoded by these genes will have similar
amino acid compositions but different sequences.

This strategy differs in two respects from methods used
previously to construct genes encoding specific proteins. First, random sequence genes are synthesized by
assembling small oligonucleotide duplexes because more sequence variation is produced by mixing
small rather than large duplexes. In contrast, long oligomers of 30 to 60 nucleotides are employed
to synthesize a gene encoding a defined amino acid sequence. Secondly, to construct random-sequence
genes, the sticky ends of each duplex must be identical so that the duplexes can be joined together in any
order. To construct genes corresponding to defined amino acid sequences, the duplexes must ligate in a fixed order.
To achieve this ordering, the sticky ends at each junction must be unique. Thus, the process of the subject
invention is unique in that random-sequence genes are synthesized using oligonucleotide duplexes 
encoding small segments of amino acids, and the sticky ends on each duplex are the same. 

Our procedure for synthesizing genes encoding recombinant COP-1 polypeptides entails using
oligonucleotide duplexes encoding segments of 3 amino acids. All possible permutations of the 3 amino
acid segments comprised of the four COP-1 amino acids and their percent occurrences in COP-1 are
shown in Figure 3. To make recombinant COP-1 genes we have synthesized oligonucleotides corresponding
to the coding and noncoding strands for some of the 3 amino acid segments (Figure 4). The
oligonucleotides are phosphorylated at the 5' ends. Complementary pairs of oligonucleotides are annealed,
forming duplexes with the 3' nucleotide extending on each strand: adenosine on the coding strand, and
thymidine on the noncoding strand. Adenosine and thymidine base pair with one another, thus ensuring that
the duplexes are joined directionally, that is, coding strands to other coding strands. Since the duplexes
have the same nucleotide extensions, they can align in any order. When duplexes corresponding to all
sixty-four 3-amino acid blocks are mixed and ligated, COP-1 genes with all possible sequences are
produced.
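
As an illustration of where the sixty-four blocks and their percent occurrences come from, the following sketch enumerates every ordered 3-amino acid block of the four COP-1 residues. It assumes each position is drawn independently at the 6:2:5:1 molar ratio; the values of Figure 3 are not reproduced here, so the numbers printed are those of this simple model only.

```python
from itertools import product

# Molar ratio of the four COP-1 amino acids (Ala:Glu:Lys:Tyr = 6:2:5:1), one-letter codes.
MOLAR_RATIO = {"A": 6, "E": 2, "K": 5, "Y": 1}

def block_percentages(ratio, block_len=3):
    """Percent occurrence of every ordered block, assuming independent residues."""
    total = sum(ratio.values())
    freq = {aa: n / total for aa, n in ratio.items()}
    table = {}
    for block in product(sorted(ratio), repeat=block_len):
        p = 1.0
        for aa in block:
            p *= freq[aa]
        table["".join(block)] = 100.0 * p
    return table

blocks = block_percentages(MOLAR_RATIO)
print(len(blocks), "blocks in total")
for name in ("AAA", "KAK", "YYY"):
    print(name, round(blocks[name], 3), "%")
```

The same table can serve as a mixing guide: adding each duplex in proportion to its block percentage biases the ligated genes toward the target composition.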

To control the length of the synthetic genes, we have included adaptors specific for the 5' and 3' ends.
One strand of each duplex adaptor is not phosphorylated (see Figure 4). As a result, ligation products 
terminate at the hydroxylated ends. By varying the ratio of the adaptor duplexes in the reaction, we can 
control the length of the synthetic genes. The adaptors serve the second function of adding specific 
extensions required to directionally clone the ligation products into the vector (Figure 5). The relative 
proportions of amino acid constituents in the random polymers can also be modulated by mixing the 
oligonucleotide duplexes in different ratios. For example, duplexes can be added to ligations in proportions 
dictated by the percent occurrence of the corresponding amino acid segments in COP-1 (Figure 3). 

The number of different duplexes incorporated determines the sequence complexity of the synthetic
genes. For example, COP-1 genes can be constructed using fewer than 64 duplexes, but not all sequence
combinations will occur. The amount of sequence complexity required will depend on the application of the
polypeptides.

One complication arises in producing completely random COP-1 amino acid sequences. For duplexes
to have the same extensions, the 3' nucleotide on the coding strands must be the same. Codons for
alanine, glutamic acid, and lysine can end with adenosine; however, for tyrosine, the last nucleotide is either
cytidine or uridine. Thus, duplexes encoding three amino acid segments ending in tyrosine (fourth column
on Figure 3) will have different extensions than the other duplexes. This limitation can be overcome by
making a second noncoding strand for each duplex with an extension of guanosine or adenosine. Because
this solution requires significantly more DNA synthesis, we have elected to exclude duplexes corresponding
to three amino acid segments ending in tyrosine.
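
The codon constraint can be verified directly against the standard genetic code. The short sketch below lists, for each of the four COP-1 amino acids, the codons whose third nucleotide is adenosine. Codons are written as DNA coding-strand triplets, so thymidine appears where the mRNA would carry uridine; the function name is illustrative only.

```python
# Standard genetic code entries for the four COP-1 amino acids (DNA coding-strand codons).
CODONS = {
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
    "Glu": ["GAA", "GAG"],
    "Lys": ["AAA", "AAG"],
    "Tyr": ["TAT", "TAC"],
}

def codons_ending_in(base, table=CODONS):
    """Map each amino acid to its codons whose third nucleotide is `base`."""
    return {aa: [c for c in codons if c.endswith(base)] for aa, codons in table.items()}

ending_in_a = codons_ending_in("A")
for aa, hits in ending_in_a.items():
    status = ", ".join(hits) if hits else "none, so a block cannot end here with a 3' A"
    print(f"{aa}: {status}")
```

Only tyrosine lacks a codon ending in adenosine, which is why blocks ending in tyrosine would need a different extension.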

Once the synthetic genes are made, they are cloned into a suitable expression vector and transferred to
a host capable of expressing the polypeptides. The host may be bacterial or eukaryotic cells. With bacteria,
cells are grown under conditions that permit formation of recombinant colonies. Each colony will contain
and express one synthetic gene. The polypeptides expressed by culture from specific colonies are isolated
and tested for the relevant biological or immunological activity. For example, COP-1 polypeptides are tested
for the ability to suppress EAE.

To screen a large number of polypeptides for activity, it may be advantageous to pool the polypeptides
before testing. Pools of polypeptides can be generated by either combining colonies and isolating the
polypeptides produced by the mixed culture or by culturing individual colonies, isolating the polypeptides
from each culture, and then combining the purified polypeptides. By testing pools of polypeptides, it is
possible to more quickly determine which of the colonies express biologically or immunologically active
polypeptides. For example, if polypeptides from 100 colonies do not exhibit activity, these colonies can be
eliminated from further investigation. If, on the other hand, the mixture of polypeptides does exhibit the
activity, then sequential subsets of the pooled polypeptides can be tested until the active
polypeptides, or mixture of polypeptides, is identified.
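
The pooling logic described above can be sketched as a recursive subdivision. This is a minimal illustration, not the procedure as practiced: it assumes a pool is active whenever at least one member is active (no dilution or synergy effects, so an active mixture of individually inactive polypeptides would be missed), and `is_active_pool` is a hypothetical stand-in for an EAE or antisera assay.

```python
def find_active(colonies, is_active_pool, pool_divisor=10):
    """Return colonies whose polypeptides are active, testing pools before individuals.

    colonies: list of colony identifiers.
    is_active_pool: callable taking a list of colonies and returning True when the
        pooled polypeptides show activity (hypothetical assay stand-in).
    """
    if not colonies or not is_active_pool(colonies):
        return []                      # inactive pool: every colony in it is eliminated
    if len(colonies) == 1:
        return list(colonies)          # a single active colony has been isolated
    size = max(1, len(colonies) // pool_divisor)
    active = []
    for start in range(0, len(colonies), size):
        sub_pool = colonies[start:start + size]
        active.extend(find_active(sub_pool, is_active_pool, pool_divisor))
    return active

hidden_active = {137, 642}   # hypothetical active colonies, for illustration only
calls = []                   # one entry per assay performed

def assay(pool):
    calls.append(len(pool))
    return any(c in hidden_active for c in pool)

found = find_active(list(range(1000)), assay)
print(found, "found with", len(calls), "assays")
```

With 1000 colonies split tenfold at each round (1000, then 100, then 10, then single colonies), far fewer assays are needed than testing every colony individually.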

Tests for some biological properties may be performed directly on the colonies, thus eliminating the
need for isolating polypeptides. For example, reactivity of polypeptides with antisera can be assessed in
colonies grown and lysed on nitrocellulose filters.
Synthesis and Phosphorylation of Oligonucleotides. The coding and noncoding oligonucleotides of
each duplex are synthesized by the phosphite triester method with an Applied Biosystems Model 380 DNA
synthesizer. The 5' ends of oligonucleotides are phosphorylated on the DNA synthesizer using
2-[2-(4,4'-dimethoxytrityloxy)ethylsulfonyl]ethyl-(2-cyanoethyl)-(N,N-diisopropyl)-phosphoramidite from Glen Research.

Purification of Oligonucleotides. The phosphorylated form of the oligonucleotide is separated from the
crude mixture by electrophoresis through a 20% acrylamide gel containing 7 M urea. Oligomers are eluted
from excised gel slices and desalted on Sep-Pak C-18 cartridges. Separation of the 5' phosphorylated
oligomer from hydroxylated forms is critical because hydroxylated oligomer causes the ligation reactions to
terminate prematurely, yielding very small products.

Annealing and Ligation of Oligonucleotides. Mixtures containing 1 nmol each of a coding and
complementary noncoding strand, 100 mM Tris-HCl, pH 7.6, and 0.1 mM ethylenediaminetetraacetic acid
(EDTA) in 50 microliters are heated to 80°C in a 600 ml water bath. The reaction is allowed to cool
slowly (2 hours) to room temperature. The temperature is decreased to 4°C over one hour by adding ice to
the water bath. For ligation, equal aliquots of the annealed duplexes are mixed and the mixture is adjusted
to contain 10 mM MgCl2, 1 mM adenosine triphosphate (ATP), 1 mM DTT and 15% polyethylene glycol.
The final concentration of Tris-HCl is 66 mM (from the annealing reactions). The total concentration of
9-mer duplexes is 10 pmol/microliter. Annealed adaptors are included in the reactions at a concentration
which will produce ligation products of the correct size and with termini that are compatible with sites in the
cloning vector. To obtain a maximum yield of synthetic genes within the correct size range, a series of
ligation reactions are set up with increasing ratios of adaptor duplex:9-mer duplex (e.g., 1:50, 1:150, 1:300).
The reactions yielding products in the desired size range are pursued. We have determined that adding 1
pmol of each adaptor duplex to 50-300 pmol of 9-mer duplexes yields products of 400-600 base pairs.
Ligation reactions are in 75 microliters with 600 units of T4 DNA ligase (New England Biolabs, Beverly,
MA). The reaction proceeds at 16°C for 16-20 hours. After ligation, polyethylene glycol is removed by
extraction with 3 volumes of chloroform. The products are concentrated by ethanol precipitation and
resuspended in 10 microliters.

Size Selection of Ligation Products. The concentrated reaction products are electrophoresed on 4%
NuSieve GTG agarose (FMC BioProducts, Rockland, ME) gels. Products are detected by staining with
ethidium bromide and the region corresponding to the desired size range (400-600 nucleotides) is excised.
Synthetic genes of 400-600 nucleotides encode polypeptides of 15,000-23,000 daltons. We have selected
genes of approximately this size because COP-1 polypeptides within this range were previously tested in
clinical trials. The agarose plug containing the synthetic DNA is stored at -20°C.
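
The relation between gene length and polypeptide mass can be approximated with a few lines of arithmetic. The sketch below assumes the encoded residues follow the 6:2:5:1 COP-1 ratio and uses standard average residue masses; real clones built from a subset of duplexes will deviate, so the result is an order-of-magnitude check rather than a prediction for any particular gene.

```python
# Average residue masses in daltons (amino acid minus one water); standard values.
RESIDUE_MASS = {"Ala": 71.08, "Glu": 129.12, "Lys": 128.17, "Tyr": 163.18}
MOLAR_RATIO = {"Ala": 6, "Glu": 2, "Lys": 5, "Tyr": 1}   # assumed composition
WATER = 18.02  # one water per chain for the free termini

def mean_residue_mass(ratio=MOLAR_RATIO, masses=RESIDUE_MASS):
    """Composition-weighted mean mass per residue."""
    total = sum(ratio.values())
    return sum(masses[aa] * n for aa, n in ratio.items()) / total

def polypeptide_mass(gene_length_nt):
    """Approximate mass of the polypeptide encoded by a gene of the given length."""
    residues = gene_length_nt // 3          # three nucleotides per codon
    return residues * mean_residue_mass() + WATER

for nt in (400, 600):
    print(nt, "nt ->", nt // 3, "residues ->", round(polypeptide_mass(nt)), "daltons")
```

Under these assumptions the estimates fall close to the quoted range, on its lower side, which is consistent with a mean residue mass a little above 100 daltons.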


Preparation of Expression Vector 

The pREV 2.1 plasmid can be constructed from a plasmid pBG1. Plasmid pBG1 can be isolated from
its E. coli host by well known procedures, e.g., using cleared lysate-isopycnic density gradient procedures,
and the like. Plasmid pBG1 was deposited in the E. coli host MS371 with the Northern Regional Research
Laboratory (NRRL, U.S. Department of Agriculture, Peoria, Illinois, U.S.A.) on November 1, 1984 and was
assigned the accession number NRRL B-15904. pREV 2.1 was constructed from plasmid expression vector
pREV 2.2. Like pBG1, pREV2.2 expresses inserted genes behind the E. coli promoter. The differences
between pBG1 and pREV2.2 are the following:

1. pREV2.2 lacks a functional replication of plasmid (rop) protein.

2. pREV2.2 has the trpA transcription terminator inserted into the AatII site. This sequence insures
transcription termination of over-expressed genes.

3. pREV2.2 has genes to provide resistance to ampicillin and chloramphenicol, whereas pBG1
provides resistance only to ampicillin.

4. pREV2.2 contains a sequence encoding sites for several restriction endonucleases.

The following procedures were used to make each of the four changes listed above:

1a. 5 ug of plasmid pBG1 was restricted with NdeI, which gives two fragments of approximately 2160
and 3440 base pairs.

1b. 0.1 ug of DNA from the digestion mixture, after inactivation of the NdeI, was treated with T4 DNA
ligase under conditions that favor intramolecular ligation (200 ul reaction volume using standard
ligation conditions [New England Biolabs, Beverly, MA]). Intramolecular ligation of the 3440 base pair
fragment gave an ampicillin resistant plasmid. The ligation mixture was transformed into the recipient strain
JM103 (available from New England Biolabs) and ampicillin resistant clones were selected by
standard procedures.

1c. The product plasmid, pBG1 N, where the 2160 base pair NdeI fragment is deleted from pBG1,
was selected by preparing plasmid from ampicillin resistant clones and determining the restriction digestion
patterns with NdeI and SalI (product fragments approximately 1790 and 1650). This deletion inactivates the
rop gene that controls plasmid replication.

2a. 5 ug of pBG1 N was then digested with EcoRI and BclI and the larger fragment, approximately
2455 base pairs, was isolated.

2b. A synthetic double stranded fragment was prepared by the procedure of Itakura et al. (Itakura, K.,
J.J. Rossi, and R.B. Wallace [1984] Ann. Rev. Biochem. 53:323-356, and references therein) with the
following structure:

5' GATCAAGCTTCTGCAGTCGACGCATGCGGATCCGGTACCCGGGAGCTCG     3'
3'     TTCGAAGACGTCAGCTGCGTACGCCTAGGCCATGGGCCCTCGAGCTTAA 5'



This fragment has BclI and EcoRI sticky ends and contains recognition sequences for several restriction
endonucleases.

2c. 0.1 ug of the 2455 base pair EcoRI-BclI fragment and 0.01 ug of the synthetic fragment were
joined with T4 DNA ligase and competent cells of strain JM103 were transformed. Cells harboring the
recombinant plasmid, where the synthetic fragment was inserted into pBG1 N between the BclI and EcoRI
sites, were selected by digestion of the plasmid with HpaI and EcoRI. The diagnostic fragment sizes are
approximately 2355 and 200 base pairs. This plasmid is called pREV1.

2d. 5 ug of pREV1 were digested with AatII, which cleaves uniquely.

2e. The following double-stranded fragment was synthesized:

5'     CGGTACCAGCCCGCCTAATGAGCGGGCTTTTTTTTGACGT 3'
3' TGCAGCCATGGTCGGGCGGATTACTCGCCCGAAAAAAAAC     5'

This fragment has AatII sticky ends and contains the trpA transcription termination sequence.

2f. 0.1 ug of AatII digested pREV1 was ligated with 0.01 ug of the synthetic fragment in a volume of
20 ul using T4 DNA ligase.

2g. Cells of strain JM103, made competent, were transformed and ampicillin resistant clones
selected.

2h. Using a KpnI, EcoRI double restriction digest of plasmid isolated from selected colonies, a cell
containing the correct construction was isolated. The sizes of the KpnI, EcoRI generated fragments are
approximately 2475 and 80 base pairs. This plasmid is called pREV1TT and contains the trpA transcription
terminator.

3a. 5 ug of pREV1TT, prepared as disclosed above (by standard methods), was cleaved with NdeI
and XmnI and the approximately 850 base pair fragment was isolated.

3b. 5 ug of plasmid pBR325 (BRL, Gaithersburg, MD), which contains the genes conferring
resistance to chloramphenicol as well as to ampicillin and tetracycline, was cleaved with BclI and the ends
blunted with Klenow polymerase and deoxynucleotides. After inactivating the enzyme, the mixture was
treated with NdeI and the approximately 3185 base pair fragment was isolated. This fragment contains the
genes for chloramphenicol and ampicillin resistance and the origin of replication.

3c. 0.1 ug of the NdeI-XmnI fragment from pREV1TT and the NdeI-BclI fragment from pBR325 were
ligated in 20 ul with T4 DNA ligase and the mixture used to transform competent cells of strain JM103.
Cells resistant to both ampicillin and chloramphenicol were selected.

3d. Using an EcoRI and NdeI double digest of plasmid from selected clones, a plasmid was selected
giving fragment sizes of approximately 2480, 1145, and 410 base pairs. This is called plasmid pREV1TT/chl
and has genes for resistance to both ampicillin and chloramphenicol.

4a. The following double-stranded fragment was synthesized:

      MluI      EcoRV ClaI  BamHI
5' CGAACGCGTGGCCGATATCATCGATGG
3' GCTTGCGCACCGGCTATAGTAGCTACC

       SalI  HindIII SmaI
   ATCCGTCGACAAGCTTCCCGGGAGCT 3'
   TAGGCAGCTGTTCGAAGGGCCC     5'

This fragment, with a blunt end and an SstI sticky end, contains recognition sequences for several
restriction endonucleases.

4b. pREV1TT/chl was cleaved with NruI (which cleaves about 20 nucleotides from the BclI
site) and SstI (which cleaves within the multiple cloning site). The larger fragment, approximately 3990 base
pairs, was isolated.

4c. The NruI-SstI fragment from pREV1TT/chl and 0.01 ug of the synthetic fragment were
treated with T4 DNA ligase in a volume of 20 ul.

4d. This mixture was transformed into strain JM103 and ampicillin resistant clones were selected.

4e. Plasmid was purified from several clones and screened by digestion with MluI or ClaI.
Recombinant clones with the new multiple cloning site will give one fragment when digested with either of
these enzymes, because each cleaves the plasmid once.

4f. The sequence of the multiple cloning site was verified. This was done by restricting the plasmid
with HpaI and PvuII and isolating the 1395 base pair fragment, cloning it into the SmaI site of mp18 and
sequencing it by dideoxynucleotide sequencing using standard methods.

4g. This plasmid is called pREV2.2.
Plasmid pREV 2.2 can be isolated from its E. coli host by well known procedures. This plasmid was
deposited in the E. coli host JM103 with the Northern Regional Research Laboratory (NRRL, U.S.
Department of Agriculture, Peoria, Illinois, USA) on July 20, 1986 and was assigned the accession number
NRRL B-10091.

Plasmid pREV 2.1 was constructed using plasmid pREV 2.2 and a synthetic oligonucleotide. An
example of how to construct pREV 2.1 is as follows:

1. Plasmid pREV 2.2 is cleaved with restriction enzymes NruI and BamHI and the 4 Kb fragment is
isolated from an agarose gel.

2. The following double-strand oligonucleotide is synthesized: 

5' CGAACGCGTGGTCCGATATCATCGATG 3' 
3' GCTTGCGCACCAGGCTATAGTAGCTACCTAG 5' 

3. The fragments from 1 and 2 are ligated in 20 ul using T4 DNA ligase, transformed into competent
E. coli cells and chloramphenicol resistant colonies are isolated.

4. Plasmid clones are identified that contain the oligonucleotide from 2, spanning the region from the
NruI site to the BamHI site and recreating these two restriction sites. This plasmid is termed pREV 2.1.

Cloning the Ligation Products into an Expression Vector. The expression vector is digested to yield
extensions that are compatible with the termini of the synthetic genes. The agarose plug
containing the size-selected synthetic genes is melted at 65-70°C and transferred to 37°C. Nine microliters
of melted agarose containing the genes is mixed with one microliter of digested vector (20 ng). The mixture
is adjusted to contain Tris-HCl, pH 7.5, 5 mM MgCl2, 5 mM DTT and 1 mM ATP and is diluted to
20 microliters. T4 DNA ligase (400 units) is added and the reaction is incubated overnight at 16°C.
Competent cells are transformed with the ligation mixtures and recombinants are identified by selection on
selective media.

Following are examples which illustrate procedures, including the best mode, for practicing the
invention. These examples should not be construed as limiting. All percentages are by weight and all
solvent mixture proportions are by volume unless otherwise noted.
Example 1 - Strategy for Synthesis of Random Sequence Genes

Model studies using several oligonucleotide duplexes were performed to assess this method for
synthesizing genes encoding polypeptides of predetermined amino acid composition but random
sequences. The synthetic genes were analyzed with respect to size, ligation junctions, composition, sequence,
and levels of expression.

Size. Synthetic genes within broad size ranges are produced by varying the ratio of adaptors to 9-mer
duplexes. We are able to select genes within more limited size ranges by resolving the ligation products on
agarose gels and excising gel slices containing products of a certain length. Using these procedures, genes
in the following three size ranges have been isolated: 75-150, 280-320, and 400-600 nucleotides.

Ligation junctions. The synthetic genes have been sequenced to demonstrate that the 9-mer duplexes
are joined end to end and without insertion or deletion of any nucleotides. Correct junctions are necessary
for maintaining the reading frame and thus producing genes that encode polypeptides of the expected
amino acid composition. Sequence analysis revealed that the junctions between the duplexes are correct
(Figure 6).

Composition. To control the amino acid composition of the encoded polypeptides, the synthetic genes
must contain the duplexes added to the ligation reactions. The results from three gene synthesis 
experiments demonstrated that the synthesized genes are composed of the duplexes included in the input 
steps of the synthesis. 

Sequence. Since each duplex can ligate to any other, the order of the 9-mer duplexes in the synthetic
genes should be random. This randomness is demonstrated in the example shown in Figure 6. The
synthetic gene shown in this example is composed of duplexes encoding the following amino acid 
segments: KKA, EAE, KAK and YKK. 
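
The random ordering and the need for in-frame junctions can be illustrated with a small simulation. The 9-nucleotide coding strands below are hypothetical: the codon choices are assumptions made for this sketch (each strand ends in A, as the method requires) and are not the sequences of Figure 4 or Figure 6. Only the coding strand is modeled; annealing, adaptors, and ligase kinetics are ignored.

```python
import random

# Hypothetical 9-nucleotide coding strands for the four segments named in the text.
DUPLEX_CODING = {
    "KKA": "AAAAAAGCA",
    "EAE": "GAAGCAGAA",
    "KAK": "AAAGCAAAA",
    "YKK": "TATAAAAAA",
}
CODON_TO_AA = {"GCA": "A", "GAA": "E", "AAA": "K", "TAT": "Y"}

def ligate(n_blocks, rng):
    """Join n_blocks randomly chosen 9-mers end to end (coding strand only)."""
    strands = list(DUPLEX_CODING.values())
    return "".join(rng.choice(strands) for _ in range(n_blocks))

def translate(gene):
    """Translate in frame; raises KeyError if a codon outside the expected set appears,
    as would typically happen after an inserted or deleted nucleotide at a junction."""
    return "".join(CODON_TO_AA[gene[i:i + 3]] for i in range(0, len(gene), 3))

rng = random.Random(0)   # fixed seed so the illustration is repeatable
gene = ligate(5, rng)
protein = translate(gene)
print(gene)
print(protein)
```

Because every block is nine nucleotides long, correct end-to-end joining keeps the reading frame, and the translated product is simply the chosen 3-amino acid segments in their ligated order.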

Expression. To measure expression levels, synthetic genes are cloned into vectors such that the
polypeptides are expressed either as fusions with heterologous vector-derived peptide sequences, or as
nonfusion polypeptides. The levels of expression of the fusion products can be readily measured by
Western blot analysis using antisera directed against the vector-derived portion of the fusion protein. Figure
7 demonstrates the expression of four COP-1-containing fusion polypeptides.

Example 2 - Variations for Synthesizing Random Sequence Genes with Small DNA Duplexes 

DNA duplexes of other lengths can be used in an approach similar to that described for duplexes of 9
nucleotides. Since three nucleotides code for one amino acid, strands of 3, 6, 12, 15, or 18 nucleotides can
be annealed, forming duplexes that code for small blocks of amino acids (see Figure 8). More sequence
variation occurs by mixing small rather than large duplexes.

In addition, terminal extensions of more than one nucleotide can be employed. Figure 8 shows an
example using duplexes of 12 nucleotides and extensions of 3 nucleotides. XXX represents any codon; the
codons in the duplexes are varied to produce polypeptides of the desired amino acid composition. This
approach restricts the polypeptide sequences since the amino acid encoded by the duplex junctions,
alanine (Ala) in this example, is repeated every fourth amino acid. Also, in another variation of the subject
invention, extensions of 5' nucleotides can be employed instead of the 3' extensions illustrated throughout
this report.

Example 3 - Synthesis of Single-Stranded Random Sequence Genes 
Gene synthesis using DNA duplexes results in double stranded genes that are ready for cloning into an
expression vector. An alternative strategy entails producing single-stranded, random sequence genes. The
application of this method to COP-1 is illustrated in Figure 9. Three-nucleotide oligomers corresponding to
codons for each of the amino acids in COP-1 are synthesized, mixed in appropriate ratios, and chemically
polymerized in solution to produce long single-stranded COP-1 genes. The complementary strand of DNA
is made enzymatically using reverse transcriptase or DNA polymerase. The double-stranded DNA is
prepared for cloning by digestion and repairing the ends of the molecules.

Single-stranded, random sequence genes could also be made by performing the polymerization step on
a DNA synthesizer. Typically, synthetic DNA is assembled one nucleotide at a time using phosphoramidite
nucleotide precursors. We developed a strategy for synthesizing single-stranded DNA in three nucleotide
segments ("codons") by using phosphoramidite trinucleotides (Figure 10). The use of 3-nucleotide building
blocks instead of single nucleotides is necessary to ensure that only specific codons, those corresponding
to the phosphoramidite trinucleotides, would occur in the synthetic genes. To test this strategy we
commissioned the custom synthesis of a phosphoramidite trinucleotide. We observed polymerization of the
trinucleotide on the DNA synthesizer.

Example 4 - Expression of Random Sequence Genes in Vector Producing Fusion Proteins

Synthetic random sequence genes can be cloned into gene fusion vectors so that the expressed 
polypeptides are comprised of a vector-derived polypeptide linked to the random sequence polypeptide. 
The application of this method to COP-1 is described here. Synthetic COP-1 genes are cloned into the 
expression vector denoted pREV 2.1 within the polylinker site. Upon expression from pREV 2.1, a 
polypeptide is synthesized comprising the amino-terminal portion (approximately 25 to approximately 45 
amino acids) of the bacterial protein linked by a peptide bond to the COP-1 polypeptide. 

We have tested twelve different COP-1 genes for expression in pREV 2.1. Fusion polypeptides were 
found to be expressed from ten of the twelve constructs. Figure 7 is a Western blot demonstrating the 
fusion proteins produced from four different clones, as detected by binding of antisera specific for the 
bacterial portion of the fusion protein. Detection of fusion proteins in the expected size range using antisera 
specific for the vector-derived portion is dependent on the presence of a COP-1 gene sequence since the 
[illegible] 

[illegible] fusion polypeptides comprised of 34 amino acids of the vector-encoded 
protein (approximately 3,900 daltons) and approximately 130-200 amino acids encoded by the random 
sequence genes (15,000-23,000 daltons). Based on migration through SDS gels, the molecular weights of 
the fusion polypeptides are in the correct range. Clone 4 produces lower levels of several smaller 
polypeptides which may be generated upon degradation of the largest species. 

The amino acids which are at the junction of the vector-derived and COP-1 polypeptides are encoded 
by the 5' oligonucleotide adaptor duplex. The adaptor duplex can be designed to encode a methionine 
residue between the bacterial protein and the COP-1 sequences. In this case, the COP-1 polypeptides can 
be released from the fusion protein by treatment with cyanogen bromide, which cleaves on the carboxyl 
terminal side of methionine residues. Both forms of COP-1 (the fusions and the free polypeptides) are 
tested. 

In addition to pREV 2.1, other fusion vectors can be employed to express COP-1 fusion polypeptides, 
and other strategies can be employed to release the COP-1 polypeptides from the fusion proteins. For 
example, to improve the expression of rCOP-1 polypeptides in E. coli, genes coding for rCOP-1-77 and 
rCOP-1-19 were subcloned from pREV 2.1 to pBG3-2AN, a plasmid used to express Protein A. pBG3-2AN 
has been deposited as described in U.S. Patent No. 4,691,009. The deposit was made on [date illegible] and 
given the accession number of NRRL B-15910. The rCOP-1 genes were isolated from the pREV 
2.1 recombinant plasmids by digestion with NcoI and EcoRI. The NcoI site occurs in the 5' linker 
employed in cloning the rCOP-1 genes and the EcoRI site is in pREV 2.1 downstream of the rCOP-1 gene. 
After digestion with the restriction enzymes, the ends of the rCOP-1 genes are blunted with 
deoxyribonucleotides and Klenow fragment. DNA fragments containing the rCOP-1 [illegible] 
agarose gels. pBG3-2AN is digested with NheI, treated with phosphatase, and the ends of the DNA are 
filled with deoxyribonucleotides and Klenow fragment. After ligation and plating, pBG3-2AN recombinants 
bearing rCOP-1 genes in the correct orientation are identified by DNA sequence analysis. The resulting 
plasmids encode fusion proteins consisting of β-glucuronidase, Protein A, and rCOP-1 sequences. A 
methionine residue occurs between the Protein A and rCOP-1 sequences, originating from the 5' linker 
sequence, in order that the COP-1 polypeptide may be cleaved from the fusion protein. The 
amino acid sequences for rCOP-1-77 and rCOP-1-19 are shown in Figures 11 and 12, respectively. rCOP-1- 
19 contains oligonucleotide duplexes encoding the following amino acid segments: YKK, AAE, [illegible], EKA, 
[illegible] YEA, AKA, KEA, and KA*. rCOP-1-77 contains oligonucleotide duplexes encoding the following amino 
acid segments: [illegible], EAE, KAK, AAK, and AAA. The N-terminal alanine residue in each sequence is left 
behind following CNBr cleavage of the fusion protein. 
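
The release step can be pictured with a small sketch of cleavage on the carboxyl side of methionine. This is a simplified model, not a description of the chemistry: the function name and both example sequences are hypothetical, the carrier is assumed to contain no internal methionine, and the conversion of the C-terminal methionine to homoserine lactone during cyanogen bromide treatment is ignored.

```python
def cnbr_cleave(sequence):
    """Split a one-letter protein sequence after every methionine (M).

    Models only the position of cleavage (carboxyl side of Met); the
    chemical modification of the methionine itself is not represented.
    """
    fragments, current = [], ""
    for residue in sequence:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


carrier = "GSHTLRDP"      # hypothetical vector-derived portion (no internal Met)
cop_like = "AYKKAAEEKA"   # hypothetical COP-1-like segment beginning with Ala
fusion = carrier + "M" + cop_like

fragments = cnbr_cleave(fusion)
print(fragments)  # ['GSHTLRDPM', 'AYKKAAEEKA']
```

The last fragment is the free random-sequence polypeptide, which begins with the alanine that followed the junction methionine.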

The invention should not be limited to the examples described above. 

Example 5 - Expression of Random Sequence Genes in Non-Fusion Vectors 

Synthetic genes can be cloned into expression vectors such that the polypeptide products are not fused 
to vector-derived protein sequences. As in fusion vectors, these non-fusion vectors contain the appropriate 
transcriptional and translational signals; however, the synthetic genes are linked directly adjacent to the 
translation initiation signal such that they contain a methionine residue as the amino-terminal amino acid. 
This single methionine can be removed from COP-1 polypeptides by cyanogen bromide cleavage. COP-1 
polypeptides with and without this amino-terminal methionine are tested for biological activity. 



Example 6 - Purification of rCOP-1 Polypeptides 

The purification of rCOP-1 polypeptides can be accomplished by a number of methods which are well 
known to those skilled in this art. For example, E. coli cells expressing Protein A/rCOP-1 fusion proteins are 
grown in a fermenter, collected by centrifugation, and lysed using a dynamill. The extract is centrifuged to 
remove debris, adjusted to contain 8 M urea, and chromatographed on an S-Sepharose column using a 
sodium chloride gradient for elution. Fractions containing the fusion protein are dialyzed against a solution 
of glycerol in phosphate buffered saline; the dialysate is centrifuged to remove contaminating proteins that 
precipitate during dialysis. The Protein A/rCOP-1 fusion protein is cleaved with cyanogen bromide and the 
rCOP-1 polypeptide is purified by gel filtration and reverse phase HPLC. 



Example 7 - EAE Experiments 

rCOP-1 has been tested for efficacy in suppressing experimental allergic encephalomyelitis (EAE). As 
described above, EAE is a T-cell mediated autoimmune disease that is employed as a model for the human 
disease multiple sclerosis. EAE experiments are performed essentially as described by Swanborg 
(Swanborg, R.M. [1988] "Experimental Allergic Encephalomyelitis," in Methods in Enzymology, vol. 162, p. 
413, Academic Press, Inc.). For example, the disease is induced in Hartley guinea pigs by a single 
subcutaneous injection of 10 µg of guinea pig myelin basic protein in Freund's adjuvant containing 100 µg 
of Mycobacterium tuberculosis. Onset of disease occurs about 12 to 20 days after induction. The disease is 
scored on a scale of 0-4: 0 = no disease; 1 = loss of coordination in hind limbs; 2 = paralysis of one or 
both hind limbs; 3 = paralysis extending to include one or both front limbs, can include incontinence of 
bladder or bowel; and 4 = extensive paralysis, inability to move. Animals are scored every [illegible] 3 days from 
the onset of disease, and most animals spontaneously recover from the disease. The treatment protocol 
consists of intramuscular injections of 500 mg of test material at 1, 6, and 11 days after induction of 
disease. The dosage, route of administration, and schedule for treatments can be varied. Also, EAE 
experiments can be performed in other species including rats and mice. Other variations of the experimental 
protocols and scoring may be used. 

Two rCOP-1 molecules, rCOP-1-77 and rCOP-1-19, have recently been tested in the EAE experiments. 
The production and purification of these molecules were performed in accordance with the procedures 
described in Examples 4 and 6. Guinea pigs treated with rCOP-1-77 or rCOP-1-19 were compared to 
animals treated with myelin basic protein, a positive control, and to an untreated group. The graph in Figure 
13 shows the average EAE scores for each treatment group versus days after induction. In Table 1, several 
aspects of disease, such as incidence, day of onset, maximum severity, and duration, are compared. 
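
The per-animal measures compared in Table 1 can be derived from a series of 0-4 scores along the following lines. This is a sketch under stated assumptions: the score series and onset days below are invented for illustration and are not data from the experiment, the function name is hypothetical, duration is assumed to run from the first to the last day with a non-zero score (inclusive), and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def summarize_animal(scores_by_day):
    """Summarize one animal from {day after induction: EAE score 0-4}."""
    sick_days = sorted(day for day, score in scores_by_day.items() if score > 0)
    if not sick_days:
        return {"diseased": False, "onset": None, "max_severity": 0, "duration": 0}
    return {
        "diseased": True,
        "onset": sick_days[0],
        "max_severity": max(scores_by_day.values()),
        # Assumed definition: first to last scored day with disease, inclusive.
        "duration": sick_days[-1] - sick_days[0] + 1,
    }


# Hypothetical score series for a single animal (illustration only).
animal = {12: 0, 14: 1, 16: 2, 18: 3, 20: 2, 22: 1, 24: 0}
summary = summarize_animal(animal)
print(summary)

# Group values are reported as mean ± standard deviation (hypothetical onsets).
onset_days = [13, 14, 15]
print(f"{mean(onset_days):.1f} ± {stdev(onset_days):.1f}")
```

Incidence for a group is then the number of animals with `diseased` set to true divided by the number tested.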

Table 1. Effects of rCOP-1 and MBP on EAE 

Group                  Incidence(1)   Day of Onset(2)       Max. Severity(2)   Duration(2) 
Untreated              7/7            13.5 ± 1.0 (13-15)    2.3 ± 0.6 (1-3)    10.8 ± 3.[illegible] (4-15) 
rCOP-1-19              8/8            16.8 ± 3.7 (13-25)    2.8 ± 1.4 (1-4)    8.6(3) ± 5.2 (2-15) 
rCOP-1-77              7/8            17.6 ± 2.8 (15-22)    2.9 ± 1.0 (1-4)    9.9 ± 4.5 (2-15) 
Myelin Basic Protein   8/8            18.1 ± 2.6 (15-20)    2.3 ± 1.1 (1-4)    5.3 ± 5.5 (2-15) 

(1) Incidence = number with disease/number tested. 
(2) Values are the mean ± standard deviation. Ranges are in parentheses. 
(3) One animal in this group died. 

The results can be summarized as follows. rCOP-1-77 and rCOP-1-19 both delayed the onset of disease 
[illegible]; however, the rCOP-1 molecules did not affect other measures of 
disease [illegible]. Myelin basic protein delayed onset and decreased 
duration, but did not affect severity or incidence. One note about this particular 
experiment is that the maximum severity for the untreated animals (2.3) is unusually low; in a pilot 
study with guinea pigs, the severity scores were 2 and 4. It may be possible to optimize the 
effects of rCOP-1 by varying the treatment procedure. 

Example 8 - Other Applications for Random Sequence Polymers of Amino Acids 

[illegible] of residues to cysteic acid, changes that produce undesirable effects on hair. 

[illegible] application is as supplements for diets deficient in certain amino acids. 


Claims 

14. [illegible] 
predetermined amino acid constituents, said method comprising the synthesis of genes encoding said 
random polymers, said synthesis comprising the polymerization of small oligonucleotide duplexes, said 
method for making polypeptides further comprising the cloning of said synthetic genes into an expression 
vector and transferring said vector into a bacteria or eukaryotic cell capable of expressing said polypeptide. 

15. A method, according to claim 14, wherein said bacteria is an Escherichia coli. 

16. A method for making a fusion polypeptide having one portion comprised of all or part of a 
heterologous polypeptide and a second portion comprised of a random sequence wherein said random 
sequence portion has predetermined amino acid constituents, and said method comprises the synthesis of 
genes encoding said random polymers, said synthesis comprising the polymerization of small 
oligonucleotide duplexes, said method further comprising the cloning of said synthetic genes into an 
expression vector adjacent to a DNA sequence encoding all or part of said heterologous polypeptide and 
transferring said vector into a bacteria or eukaryotic cell capable of expressing said fusion polypeptide. 

17. A method for synthesizing and identifying a biologically or immunologically active polypeptide, or 
mixture of polypeptides, said method comprising the following steps: 

(a) synthesizing genes encoding random polypeptides by polymerization of small oligonucleotide 
duplexes; 

(b) cloning each of said synthetic genes into a vector and transferring said vector into a host capable 
of expressing said polypeptides; 

(c) growing said hosts under conditions which permit the formation of recombinant colonies which 
each express one recombinant gene; 

(d) testing the polypeptide, or a mixture of the recombinant polypeptides, combined either before or 
after isolation from said colonies, for evidence of biological and/or immunological activity; 

(e) where activity is observed for the mixture of polypeptides, generating smaller subsets of that 
combination and testing each of these subsets for biological activity; and 

(f) repeating step (e) until the active component(s) or a suitable mixture thereof is obtained. 

18. A synthetic gene which codes for rCOP-1-19. 

19. A synthetic gene which codes for rCOP-1-77. 






EP 0 383 620 A2 



Strategy for Synthesizing Random Genes and 
Identifying Recombinant Polypeptides with Biological Activity 

Figure 1 

[Flow diagram] 
mixture of oligonucleotide duplexes 
-> ligate 
-> random sequence synthetic genes 
-> clone in an expression vector, transform into host 
-> recombinant colonies, each contains one synthetic gene 

Identification of biologically active polypeptides: 
- pool colonies (e.g. 1000), isolate polypeptides, test for biological activity 
- fractionate into smaller pools of colonies (e.g. 10 pools of 100 colonies) 
- select individual colonies, isolate the polypeptide, test for biological activity 
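
The pooling strategy of Figure 1 amounts to a recursive search, which can be sketched as follows. This is only an outline under stated assumptions: `is_active` is a hypothetical stand-in for the biological assay, a pool is assumed to test positive whenever at least one member is active (no synergy or interference between polypeptides), and the colony numbers and active set are invented for illustration.

```python
def find_active(colonies, is_active, n_subpools=10):
    """Return the individual colonies that test positive.

    A pool is tested first; only positive pools are fractionated into
    smaller pools, down to single colonies.
    """
    if not is_active(colonies):
        return []
    if len(colonies) == 1:
        return list(colonies)
    step = max(1, len(colonies) // n_subpools)
    hits = []
    for start in range(0, len(colonies), step):
        hits.extend(find_active(colonies[start:start + step], is_active, n_subpools))
    return hits


# Hypothetical example: 1000 colonies, two of which express an active polypeptide.
colonies = list(range(1000))
truly_active = {137, 842}
assay_count = 0


def assay(pool):
    """Stand-in for the activity test on the pooled polypeptides."""
    global assay_count
    assay_count += 1
    return any(colony in truly_active for colony in pool)


found = find_active(colonies, assay)
print(found, assay_count)
```

With 1000 colonies and ten sub-pools per round, the pools shrink from 1000 to 100 to 10 to single colonies, so far fewer assays are needed than testing every colony individually.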






[Flow diagram] 
Oligonucleotides (9-mers) 
-> anneal 
-> Duplexes 
-> ligate 
-> Synthetic Genes 
-> expression 
-> Polypeptides, e.g. Ala-Ala-Ala-Ala-Ala-Ala-Lys-Lys-Lys-Lys-Lys 
   (two further example polypeptides of alanine and lysine are illegible in the source) 
Figure 3 

All Possible 3 Amino Acid Combinations 
and Their Percent Occurrence in Cop 1 

AAA  7.872     AAE  2.624     AAK  6.560     AAY  1.312 
AEA  2.624     AEE  0.875     AEK  2.187     AEY  0.437 
AKA  6.560     AKE  2.187     AKK  5.466     AKY  1.093 
AYA  1.312     AYE  0.437     AYK  1.093     AYY  0.219 
EAA  2.624     EAE  0.875     EAK  2.187     EAY  0.437 
EEA  0.875     EEE  0.292     EEK  0.729     EEY  0.146 
EKA  2.187     EKE  0.729     EKK  1.822     EKY  0.364 
EYA  0.437     EYE  0.146     EYK  0.364     EYY  0.073 
KAA  6.560     KAE  2.187     KAK  5.466     KAY  1.093 
KEA  2.187     KEE  0.729     KEK  1.822     KEY  0.364 
KKA  5.466     KKE  1.822     KKK  4.555     KKY  0.911 
KYA  1.093     KYE  0.364     KYK  0.911     KYY  0.182 
YAA  1.312     YAE  0.437     YAK  1.093     YAY  0.219 
YEA  0.437     YEE  0.146     YEK  0.364     YEY  0.073 
YKA  1.093     YKE  0.364     YKK  0.911     YKY  0.182 
YYA  0.219     YYE  0.073     YYK  0.182     YYY  0.036 

A=Alanine 
E=Glutamic Acid 
K=Lysine 
Y=Tyrosine 
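
The percentages in Figure 3 are consistent with treating each of the three positions as an independent draw from the approximate 6:2:5:1 molar ratio of alanine, glutamic acid, lysine and tyrosine. The sketch below reproduces the calculation under that independence assumption; the variable names are illustrative only.

```python
from itertools import product

# Approximate COP-1 molar ratio of Ala : Glu : Lys : Tyr.
MOLAR_RATIO = {"A": 6, "E": 2, "K": 5, "Y": 1}

total = sum(MOLAR_RATIO.values())
fraction = {aa: count / total for aa, count in MOLAR_RATIO.items()}

# Percent occurrence of each tripeptide, assuming independent positions.
tripeptide_percent = {}
for triple in product(MOLAR_RATIO, repeat=3):
    probability = 1.0
    for aa in triple:
        probability *= fraction[aa]
    tripeptide_percent["".join(triple)] = 100 * probability

for name in ("AAA", "AKK", "YYY"):
    print(name, round(tripeptide_percent[name], 3))
```

For example, AAA is (6/14)^3, or about 7.872 percent, and YYY is (1/14)^3, or about 0.036 percent; the 64 values sum to 100.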
Figure 4 

9-Nucleotide Duplexes and Adaptors 
for Cop 1 Gene Synthesis 

Examples of 9-Nucleotide Duplexes for Cop 1 Genes: 

Coding (5' end)        AAGAAGGCA      GAAGCAGAA 
Noncoding (3' end)    TTTCTTCCG      TCTTCGTCT 
Amino acids            K  K  A        E  A  E 

Adaptors: 

For the 5' end: GATC (BamH I) 
For the 3' end: AGCT (Sac I) 
[The remainder of the adaptor sequences is illegible in the source.] 
Figure 5 

Construction and Cloning of 
Synthetic COP 1 Genes 

[Flow diagram] 
5' adaptor (GATC, BamH I) + 9-mers + 3' adaptor (AGCT, Sac I) 
-> ligase 
-> size-select genes 
-> clone into vector (plasmid vector cut with BamH I and Sac I) 
-> synthetic gene inserted between the BamH I and Sac I ends of the plasmid vector 
Figure 6 

Sequence Analysis of a 
Synthetic Gene 

amino acids:  Y  K  K  E  A  E  K  A  K 
nucleotides:  TACAAGAAA|GAAGCAGAA|AAGGCTAAA| 

amino acids:  Y  K  K  Y  K  K  K  K  A 
nucleotides:  TACAAGAAA|TACAAGAAA|AAGAAGGCA| 

amino acids:  E  A  E  K  K  A  E  A  E 
nucleotides:  GAAGCAGAA|AAGAAGGCA|GAAGCAGAA| 

amino acids:  K  K  A  E  A  E  E  A  E 
nucleotides:  AAGAAGGCA|GAAGCAGAA|GAAGCAGAA| 

amino acids:  K  K  A  K  A  K  E  A  E 
nucleotides:  AAGAAGGCA|AAGGCTAAA|GAAGCAGAA| 

amino acids:  K  K  A  K  K  A  K  A  K 
nucleotides:  AAGAAGGCA|AAGAAGGCA|AAGGCTAAA| 

The lines in the nucleotide sequence mark the junctions 
between 9-mer duplexes. 

Amino Acids: 
A=alanine 
K=lysine 
E=glutamic acid 
Y=tyrosine 
Figure 7 

Figure 8 

Variations of Random-Gene Synthesis 
using Small Duplexes 

One Nucleotide Extension: 

3-mers      GCA     GAA 
           TCG     TCT 

6-mers      GCAAAA     GAAGCA 
           TCGTTT     TCTTCG 

12-mers     GCAGAAGCAAAA 
           TCGTCTTCGTTT 

Three Nucleotide Extensions, 12-mer Duplexes: 

duplex      XXXXXXXXXGCA 
         CGTXXXXXXXXX 

-> ligate 

gene        XXXXXXXXXGCAXXXXXXXXXGCAXXXXXXXXXGCA 
         CGTXXXXXXXXXCGTXXXXXXXXXCGTXXXXXXXXX 

polypeptide Ala aa aa aa Ala aa aa aa Ala aa aa aa Ala 

xxx = any codon 
aa = amino acid specified by codon xxx 
Ala = alanine 
Figure 9 

Synthesis of Single-Stranded DNA Encoding 
Random Sequence Amino Acid Polymers 

3 Nucleotide Segments for COP 1 

Ala   5' GCA 3' 
Glu   5' GAA 3' 
Lys   5' AAA 3' 
Tyr   5' TAC 3' 
ratio: [values illegible in the source] 

[Flow diagram] 
Polymerize (A. Solution Chemistry, or B. DNA Synthesizer) 
-> Single-Stranded DNA 
-> Enzymatic Synthesis of the Second Strand 
-> Double-Stranded DNA 
-> Clone into Expression Vector 
Figure 10 

Phosphoramidite Trinucleotide for Random 
Gene Synthesis Using a DNA Synthesizer 

[Chemical structure drawings: a phosphoramidite trinucleotide and a phosphoramidite mononucleotide, 
each bearing a DMTr group and a cyanoethyl diisopropyl phosphoramidite group. The structures are not 
recoverable from the source.] 
FIGURE 11 
rCOP-1-77 

[Nucleotide and amino acid sequence of rCOP-1-77. Most of the listing is illegible in the source; 
two legible coding-strand fragments are shown with their amino acids.] 

E  A  E  Y  K  K  Y  K  K  K  A  K  K  A  K 
GAAGCAGAATACAAGAAATACAAGAAAAAGGCTAAAAAGGCTAAA 

K  K  Y  K  K  E  A  E  K  A  K  E  A  E 
AAGAAATACAAGAAAGAAGCAGAAAAGGCTAAAGAAGCAGAAT 
FIGURE 12 
rCOP-1-19 

[Nucleotide and amino acid sequence of rCOP-1-19. Most of the listing is illegible in the source; 
the legible terminal segment of both strands is shown.] 

GCGAAAGCAAAGGCTGCATAA 
CGCTTTCGTTTCCGACGTATT 



